NetNews Offline 2

Path: svnews.ubinet.ubs.com!ubszh!ian.johnston@ubs.com
From: ian.johnston@ubs.com (Ian Johnston (by ubsswop))
Newsgroups: comp.lang.c++
Subject: Re: Would/Won't you use a garbage collector?
Date: 10 Apr 1996 11:34:48 GMT
Organization: UBS
Distribution: world
Message-ID: <4kg6co$2h3@ubszh.fh.zh.ubs.com>
References: <4kamie$e4d@dfw-ixnews3.ix.netcom.com>
NNTP-Posting-Host: nol2179.fh.zh.ubs.com

In article <4kamie$e4d@dfw-ixnews3.ix.netcom.com>, giuliano@ix.netcom.com(Giuliano Carlini) writes:
|> I'm a long time proponent of using garbage collection in C and C++
|> programs, and I'm curious:
|> - How many others are there?
|> - Why don't most C/C++ programmers use it?
|> I'm particularly interested in finding out why most C/C++ don't use it.
|> While I have my own theories - which I'll describe below - I'm
|> interested in finding out more directly from those who are against it.

[...]

|> What follows is my belief for why garbage collection is so little used.
|> Feel free to respond to anything I say below, but please, first respond
|> to the questions above. I believe that most people don't use garbage
|> collection because either they:
|> - don't know what it is
|> - don't know it can be used with C/C++.
|> - are misinformation
|> - are biased against it by the C/C++ culture
|> In my experience, most C/C++ programmers either don't know what garbage
|> collection is, or don't know that it can be used with C/C++. After all,
|> no major C/C++ compiler includes a garbage collector. At least, as far
|> as I know. I hope I'm wrong, and that someone can correct me. But even
|> after, I tell them what it is, and that it can be used with C++, almost
|> everyone still rejects it.
|>
|> At first, most offer technical reasons for rejecting it. Almost all are
|> based on misinformation, since garbage collection is usable and
|> benificial for the vast majority of systems.

I think the reasons you give are correct.
I also agree that for many (simple) systems, garbage collection (GC) is a good idea. I would have no objection to GC being the default for C++, provided that I could switch it off and incur only a *minimal penalty* for doing so; that is, a minimal performance penalty imposed by the collector even though it is not used. A performance penalty equivalent to virtual vs non-virtual function calls would be acceptable.

Remember too, that GC in C or C++ is *hard*. C has been around for 25 years or so, and C++ for about 15; it is only relatively recently that efficient garbage collectors have appeared for C and/or C++. It is not so much that the culture set out biased against GC, but the culture has probably grown that way, for the first of the three reasons you give above.

Now, do I use GC? Not in the systems I write for a living. Here's why.

First, I write servers that are intended to run for a long time, perhaps 3 months, perhaps 6 months, perhaps a year. These servers are constantly allocating and freeing objects. To use GC, I need a collector that collects 100% of the dead objects: not 95%, not 99%, not even 99.9%. I don't know of any collector for C++ that can give this guarantee. Let's say a server averages 100 object creations per second. In a 10 hour day, that's 3.6 million objects. Say a collector leaks 0.1% of all objects. That's 3600 objects per day. At an average of 2k per object, that's about 7MB of memory leaked per day.

Second, I write multi-threaded programs. It's not clear to me how GC works in a multi-threaded environment. Can current collectors handle one thread allocating, and a different thread freeing? Or will the apparently dead object in the allocator thread be collected, even though it is still in use by other threads? The answer to that second question has to be "no" if GC is to work in a multi-threaded environment.

Third, in the code I write, I use a variety of helper classes to help manage memory (and other resources; see below).
I don't tend to use C-style arrays (stack or heap-based). I don't tend to use raw C++ pointers. These things dramatically reduce the potential for bugs. In addition, I make frequent use of customised memory allocators to increase performance of allocating/releasing space for objects. Sometimes I use shared memory. Sometimes I use statically allocated memory. Sometimes I use heap memory.

Unfortunately, there are leaks in the C libraries I use. If a 100% reliable GC could somehow be confined to the library, that would make life easier. As it is, I have to pick and choose the C library routines I use. Sadly, some can't be avoided. This is an argument for GC, rather than against :-)

Fourth, the code I write lives in a mixed-language environment. My C++ libraries are linked with main programs written in C or Ada. While GC might survive across a C/C++ boundary, it is not clear to me that it would survive across an Ada/C++ boundary. It has been a significant effort to craft my C++ libraries so that they can run reliably without relying on static constructors being called!

These are the practical reasons. There is another, major reason which is partly practical and partly philosophical.

As you point out, memory is a resource. But it is not the only resource my software uses. There are other resources that are in relatively short supply: file handles, network connections, semaphores, even threads in some cases. I don't understand why GC should be applied only to memory. If it is important to automatically reclaim unused allocated memory, why is it not important to automatically reclaim unused file handles, or unused network connections?

An important and extremely useful idiom in C++ is the technique of acquiring a resource in a constructor and releasing it in the destructor. I have come to use this idiom very heavily, and it has made my code much simpler, much less prone to mistakes, much more robust in the presence of exceptions, and much more maintainable.
Here's an example:

    class AutoLock
    {
    public:
        AutoLock(Mutex &m) : mtx(m) { mtx.lock(); }   // acquire on construction
        ~AutoLock() { mtx.unlock(); }                 // release on scope exit
    private:
        Mutex &mtx;
    };

Locking and unlocking a mutex at precisely the right times is critical to maximising concurrency and robustness in multi-threaded applications. If somehow the destruction of this AutoLock were left to a GC system, I would lose control over unlocking the mutex; this would be a disaster for concurrency. I simply cannot afford to let the system decide, at some point in the future, to release the mutex.

I could, of course, replace this:

    void someFunc()
    {
        AutoLock lock(someMutex);
        manipulate(someObject);
    }

with this:

    void someFunc()
    {
        someMutex.lock();
        try {
            manipulate(someObject);
        }
        catch(...) {
            someMutex.unlock();
            throw;
        }
        someMutex.unlock();
    }

Having this discussion some time back on comp.lang.eiffel, people there actually proposed that this second version (written in Eiffel) was the way to go. (Eiffel, of course, doesn't even have a finalisation mechanism, so there is no way to write the first version in Eiffel anyway. At least Java has finalisation, but no guarantees when it will be called [or even whether it will be called, if I remember rightly].)

Frankly, I am not prepared to forgo the first version without some extremely convincing arguments.

If I wrote GUI programs, I could make the same arguments about windows. When I click "OK" in a dialog box, I expect the dialog box to disappear. I don't expect the dialog box to hang around on screen until the GC kicks in and reclaims the dialog box (i.e. calls its finalisation routine/destructor).

Of course, GC proponents will say that you can just call the finalisation routine explicitly, thereby closing the dialog box or releasing the mutex. But that doesn't gain anything. If I need to do that, I might just as well delete my allocated objects explicitly.
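To make the point about explicit finalisation concrete, here is a minimal sketch contrasting an explicit close() call with a destructor-based guard in the style of AutoLock. Connection, ConnectionGuard, work, useExplicit, useGuard and stillOpenAfter are all hypothetical names invented for this illustration, and the code assumes a C++11 compiler rather than anything available in 1996. It only demonstrates the control flow; it says nothing about how any particular collector or library behaves.

```cpp
#include <stdexcept>

// Hypothetical stand-in for a scarce non-memory resource
// (file handle, network connection, semaphore, ...).
class Connection {
public:
    void close() { open_ = false; }
    bool isOpen() const { return open_; }
private:
    bool open_ = true;   // the resource is held from construction
};

// Scope-bound guard: releases the resource in its destructor,
// whether the scope is left normally or by an exception.
class ConnectionGuard {
public:
    explicit ConnectionGuard(Connection& c) : conn_(c) {}
    ~ConnectionGuard() { conn_.close(); }
    ConnectionGuard(const ConnectionGuard&) = delete;
    ConnectionGuard& operator=(const ConnectionGuard&) = delete;
private:
    Connection& conn_;
};

// Hypothetical unit of work that may fail.
void work(bool fail) {
    if (fail) throw std::runtime_error("work failed");
}

// Explicit finalisation: close() is skipped if work() throws.
void useExplicit(Connection& conn, bool fail) {
    work(fail);
    conn.close();
}

// Destructor-based release: the guard closes the connection on every exit path.
void useGuard(Connection& conn, bool fail) {
    ConnectionGuard guard(conn);
    work(fail);
}

// Runs one of the two styles and reports whether the resource is still held.
bool stillOpenAfter(void (*use)(Connection&, bool), bool fail) {
    Connection conn;
    try {
        use(conn, fail);
    } catch (const std::runtime_error&) {
        // Swallowed here only so the resource state can be inspected.
    }
    return conn.isOpen();
}
```

Under these assumptions, stillOpenAfter(useExplicit, true) reports the connection as still open, because the exception skips the close() call, while stillOpenAfter(useGuard, true) reports it closed, because the guard's destructor runs during stack unwinding.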
In the worst of all possible scenarios, you might have to call finalisation routines explicitly for some objects (windows, mutexes), but not for others (allocated memory). This opens up all sorts of possibilities for resource leaks and/or errors.

What is needed is either:

- Everything is collected, not just memory resources; the approach
  taken by languages like Python, I guess Lisps, and presumably, even
  humble Visual Basic. For example, how do you create and destroy
  windows in VB? Isn't it enough to just DIM a window and forget about
  destroying it? I think you can do this with OLE objects, anyway.

- Nothing is collected, so the programmer knows they have to manage
  everything themselves; the approach taken by C++, C, and even
  Pascal, I suppose.

A halfway house is just a recipe for confusion, it seems to me.

To sum up:

- I'm not against the concept of automatically reclaiming resources
- I am against singling out memory as a special resource
- I am against collecting anything less than 100% of resources

Ian